Instruction Fetch Energy Reduction Using Forward-Branch Bufferable Innermost Loop Buffer
Abstract
Several loop buffer designs have recently been proposed to reduce instruction fetch energy, owing to the size and location advantages of a loop buffer. However, design complexity restricts most of these designs to storing only innermost loops without forward branches, or only the instructions that precede a forward branch within an innermost loop. Although program modeling shows that typical programs are best represented with a simple loop model, many of them contain forward branches in their innermost loops. For example, MiBench spends 71% of its execution time in innermost loops, and 27% of these innermost loops contain one or more forward branches. Existing designs are therefore limited in how much instruction fetch energy they can save. We propose a simple and effective way to cope with this complexity: since using a BTB is the norm in most designs, adding an extra bit in the BTB, indicating whether the loop buffer stores the fall-through trace or the target trace after a forward branch within the innermost loop, avoids much of the complexity. Results with MiBench show up to a 14.1% further reduction in instruction fetch energy, with only 1.8% hardware overhead in the BTB, compared with a design without forward-branch handling.
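The role of the extra BTB bit can be illustrated with a small behavioral sketch. This is not the authors' implementation: the names `BTBEntry`, `stored_is_target`, `fetch_source`, and `count_fetches` are hypothetical, and the sketch assumes that the bit lives in the BTB entry of the forward branch and that a fetch falls back to the instruction cache whenever the branch direction does not match the buffered trace.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class BTBEntry:
    """Hypothetical BTB entry for a forward branch inside an innermost loop."""
    target: int             # branch target address
    stored_is_target: bool  # assumed extra bit: True if the loop buffer holds
                            # the target (taken) trace, False if it holds the
                            # fall-through (not-taken) trace


def fetch_source(btb: Dict[int, BTBEntry], branch_pc: int, taken: bool) -> str:
    """Decide where the instructions after the forward branch come from.

    Returns "loop_buffer" when the branch direction matches the trace the
    loop buffer holds, otherwise "icache".
    """
    entry = btb.get(branch_pc)
    if entry is None:
        # Branch not tracked in the BTB: assume no buffered trace exists.
        return "icache"
    return "loop_buffer" if entry.stored_is_target == taken else "icache"


def count_fetches(btb: Dict[int, BTBEntry], branch_pc: int,
                  outcomes: Iterable[bool]) -> Dict[str, int]:
    """Tally fetch sources over a sequence of branch outcomes (True = taken)."""
    counts = {"loop_buffer": 0, "icache": 0}
    for taken in outcomes:
        counts[fetch_source(btb, branch_pc, taken)] += 1
    return counts
```

Under these assumptions, a loop whose forward branch mostly follows the buffered direction keeps fetching from the loop buffer after the branch, and only the mismatching iterations go to the instruction cache.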
Similar resources
Block Based Fetch Engine for Superscalar Processors
The implementation of modern high-performance computers is increasingly directed toward parallelism in the hardware. However, most current fetch units are limited to one branch prediction per cycle and can therefore fetch no more than one basic block per cycle. While fetching a single basic block each cycle is sufficient for implementations that issue a small number of instructions per cyc...
Integrated I-cache Way Predictor and Branch Target Buffer to Reduce Energy Consumption
In this paper, we present a Branch Target Buffer (BTB) design for energy savings in set-associative instruction caches. We extend the functionality of a BTB by caching way predictions in addition to branch target addresses. Way prediction and branch target prediction are done in parallel. Instruction cache energy savings are achieved by accessing one cache way if the way prediction for a fetch i...
The Precomputed Branch Architecture
Accurate instruction fetch and branch prediction is increasingly important on today’s superscalar architectures. Fetch prediction is the process of determining the next instruction to request from the memory subsystem. Branch prediction is the process of predicting the likely outcome of branch instructions. A branch target buffer (BTB) is often used to provide target addresses for taken branch...
The Basic Block Reassembling Instruction Stream Buffer with LWBTB for X86 ISA
The potential performance of superscalar processors can be exploited only when the processor is fed with sufficient instruction bandwidth. The front-end units, the Instruction Stream Buffer (ISB) and the fetcher, are the key elements for achieving this goal. Current ISBs cannot support instruction streaming beyond a basic block. In x86 processors, the split-line instruction problem worsens this ...
Using a serial cache for energy efficient instruction fetching
Computer Science Department, University of California, Los Angeles; Department of Computer Science and Engineering, University of California, San Diego. Abstract: The design of a high performance fetch architecture can be challenging due to poor interconnect scaling and energy concerns. Way prediction has been presented as one means of scaling the fetch engine to shorter cycle times, while providi...
Publication date: 2006